Abstract

The performance of sparse signal recovery can be improved if both the sparsity and the
correlation structure of signals are exploited. One typical correlation structure is the
intra-block correlation in block sparse signals. To exploit this structure, a framework
called block sparse Bayesian learning (BSBL) has been proposed recently. Algorithms derived
from this framework show superior performance, but they are not very fast, which limits
their applications. This work derives an efficient algorithm from this framework using a
marginalized likelihood maximization method. Compared to existing BSBL algorithms, it has
close recovery performance but is much faster. It is therefore more suitable for large-scale
datasets and applications requiring real-time implementation.
1. Introduction

Sparse signal recovery and the associated compressed sensing [1] can recover a signal from a
small number of measurements with a high probability of success (or sufficiently small
errors), given that the signal is sparse or can be sparsely represented in some domain. It
has been found that exploiting the structure of a signal can further improve the recovery
performance. In practice, a signal generally has rich structure. One such structure is
the block/group sparse structure, which refers to the case when the nonzero entries
of a signal cluster around some locations. Existing algorithms that exploit such information
have shown improved recovery performance.

Recently, noticing intra-block correlation widely exists in real-world signals, Zhang and 
Rao j3, E| proposed the block sparse Bayesian learning (BSBL) framework. A number of 
algorithms have been derived from this framework, and showed superior ability to recover 
block sparse signals or even non-sparse signals [7]. But these BSBL algorithms are not fast, 
and thus cannot be applied to large-scale datasets. 

In this work, we propose a BSBL algorithm using a fast marginalized likelihood
maximization (FMLM) method [8]. Thanks to the BSBL framework, it can exploit both block
structure and intra-block correlation. Experiments conducted on both synthetic data and
real ECG data showed that the proposed algorithm significantly outperforms traditional
algorithms that exploit only block structure, such as Model-CoSaMP [4] and Block-OMP [9],
and has recovery accuracy similar to that of the BSBL algorithms [2] while being much
faster, which makes it more suitable for large-scale problems.
Throughout the paper, bold symbols are reserved for vectors and matrices.

2. The Block Sparse Bayesian Learning Framework 

A block sparse signal x has the following structure: 

$$\mathbf{x} = [\underbrace{x_1, \cdots, x_{d_1}}_{\mathbf{x}_1^T}, \cdots, \underbrace{x_1, \cdots, x_{d_g}}_{\mathbf{x}_g^T}]^T, \tag{1}$$

which means $\mathbf{x}$ has $g$ blocks, and only a few blocks are nonzero. Here $d_i$ is the block size of
the $i$th block. To model the intra-block correlation in the $i$th block, the BSBL framework
suggests using the parameterized Gaussian distribution

$$p(\mathbf{x}_i; \gamma_i, \mathbf{B}_i) = \mathcal{N}(\mathbf{x}_i; \mathbf{0}, \gamma_i \mathbf{B}_i), \tag{2}$$

with unknown deterministic parameters $\gamma_i$ and $\mathbf{B}_i$. Although blocks have intra-block
correlation, the framework assumes that blocks are mutually independent. The observation $\mathbf{y}$ is
obtained by

$$\mathbf{y} = \mathbf{\Phi}\mathbf{x} + \mathbf{n}, \tag{3}$$

where $\mathbf{\Phi}$ is an $M \times N$ sensing matrix and $\mathbf{n}$ is the observation noise. The observation noise
is assumed to be independent and Gaussian with zero mean and variance $\beta^{-1}$, where $\beta$ is
also unknown. Thus the likelihood is given by

$$p(\mathbf{y} | \mathbf{x}; \beta) = \mathcal{N}(\mathbf{\Phi}\mathbf{x}, \beta^{-1}\mathbf{I}). \tag{4}$$

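The posterior can be calculated as $p(\mathbf{x} | \mathbf{y}; \{\gamma_i, \mathbf{B}_i\}_i, \beta) = \mathcal{N}(\boldsymbol{\mu}, \mathbf{\Sigma})$ with
$\mathbf{\Sigma} = (\mathbf{\Gamma}^{-1} + \mathbf{\Phi}^T \beta \mathbf{\Phi})^{-1}$ and $\boldsymbol{\mu} = \mathbf{\Sigma}\mathbf{\Phi}^T \beta \mathbf{y}$, where $\mathbf{\Gamma}$ denotes a block diagonal
matrix with the $i$th principal block given by $\gamma_i \mathbf{B}_i$. Once all the parameters, namely
$\{\gamma_i, \mathbf{B}_i\}_i$ and $\beta$, are estimated, the MAP estimate of the signal $\mathbf{x}$ can be directly
obtained from the mean of the posterior, i.e.,

$$\hat{\mathbf{x}} = \mathbf{\Sigma}\mathbf{\Phi}^T \beta \mathbf{y}. \tag{5}$$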
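As a minimal numerical sketch of the posterior in (4)-(5), the NumPy code below assembles the block diagonal matrix from given $\gamma_i$ and $\mathbf{B}_i$ and returns the posterior mean and covariance. The function name `bsbl_posterior` and its argument layout are illustrative assumptions, not part of the published BSBL code, and the sketch assumes every $\gamma_i > 0$ so that the block diagonal matrix is invertible.

```python
import numpy as np

def bsbl_posterior(Phi, y, gammas, Bs, beta):
    """Posterior mean and covariance of x for fixed BSBL parameters.

    Hypothetical helper: assumes all gammas are strictly positive.
    """
    N = Phi.shape[1]
    Gamma = np.zeros((N, N))
    start = 0
    for gamma_i, B_i in zip(gammas, Bs):
        d_i = B_i.shape[0]
        # i-th principal block of Gamma is gamma_i * B_i
        Gamma[start:start + d_i, start:start + d_i] = gamma_i * B_i
        start += d_i
    # Sigma = (Gamma^{-1} + beta * Phi^T Phi)^{-1}
    Sigma = np.linalg.inv(np.linalg.inv(Gamma) + beta * Phi.T @ Phi)
    # mu = beta * Sigma Phi^T y, which is also the MAP estimate in (5)
    mu = beta * Sigma @ Phi.T @ y
    return mu, Sigma
```

The explicit inverses keep the sketch close to the formulas; a practical implementation would normally avoid forming them for large $N$.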

To estimate the parameters, the following cost function, derived from Type II maximum
likelihood, is generally used:

$$\mathcal{L}(\{\gamma_i, \mathbf{B}_i\}_i, \beta) = \log|\mathbf{C}| + \mathbf{y}^T \mathbf{C}^{-1} \mathbf{y}, \tag{6}$$

where $\mathbf{C} = \beta^{-1}\mathbf{I} + \mathbf{\Phi}\mathbf{\Gamma}\mathbf{\Phi}^T$. There are several methods to minimize the cost function [2]. In
the following we use the marginalized likelihood maximization method, which
was used by Tipping et al. [8] for their basic SBL algorithm and later by Ji et al.
[10] for their Bayesian compressive sensing algorithm.

3. The Proposed Fast Block SBL Algorithm 

3.1. The Main-body of the Algorithm 

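The cost function (6) can be optimized in a block-wise manner. We denote by $\mathbf{\Phi}_i$ the $i$th block
in $\mathbf{\Phi}$, with the column indexes corresponding to the $i$th block of the signal $\mathbf{x}$. Then $\mathbf{C}$ can
be rewritten as:

$$\mathbf{C} = \beta^{-1}\mathbf{I} + \sum_{m \neq i} \mathbf{\Phi}_m \gamma_m \mathbf{B}_m \mathbf{\Phi}_m^T + \mathbf{\Phi}_i \gamma_i \mathbf{B}_i \mathbf{\Phi}_i^T, \tag{7}$$

$$\phantom{\mathbf{C}} = \mathbf{C}_{-i} + \mathbf{\Phi}_i \gamma_i \mathbf{B}_i \mathbf{\Phi}_i^T, \tag{8}$$

where $\mathbf{C}_{-i} = \beta^{-1}\mathbf{I} + \sum_{m \neq i} \mathbf{\Phi}_m \gamma_m \mathbf{B}_m \mathbf{\Phi}_m^T$. Using the Woodbury identity,

$$|\mathbf{C}| = |\gamma_i \mathbf{B}_i| \, |\mathbf{C}_{-i}| \, |\mathbf{A}_i^{-1} + \mathbf{s}_i|, \tag{9}$$

$$\mathbf{C}^{-1} = \mathbf{C}_{-i}^{-1} - \mathbf{C}_{-i}^{-1} \mathbf{\Phi}_i (\mathbf{A}_i^{-1} + \mathbf{s}_i)^{-1} \mathbf{\Phi}_i^T \mathbf{C}_{-i}^{-1}, \tag{10}$$

where $\mathbf{A}_i = \gamma_i \mathbf{B}_i$, $\mathbf{s}_i = \mathbf{\Phi}_i^T \mathbf{C}_{-i}^{-1} \mathbf{\Phi}_i$, and $\mathbf{q}_i = \mathbf{\Phi}_i^T \mathbf{C}_{-i}^{-1} \mathbf{y}$, Equation (6) can be rewritten as:

$$\mathcal{L} = \log|\mathbf{C}_{-i}| + \mathbf{y}^T \mathbf{C}_{-i}^{-1} \mathbf{y} + \log|\mathbf{I}_{d_i} + \mathbf{A}_i \mathbf{s}_i| - \mathbf{q}_i^T (\mathbf{A}_i^{-1} + \mathbf{s}_i)^{-1} \mathbf{q}_i, \tag{11}$$

$$\phantom{\mathcal{L}} = \mathcal{L}(-i) + \mathcal{L}(i), \tag{12}$$

where $\mathcal{L}(-i) = \log|\mathbf{C}_{-i}| + \mathbf{y}^T \mathbf{C}_{-i}^{-1} \mathbf{y}$, and

$$\mathcal{L}(i) = \log|\mathbf{I}_{d_i} + \mathbf{A}_i \mathbf{s}_i| - \mathbf{q}_i^T (\mathbf{A}_i^{-1} + \mathbf{s}_i)^{-1} \mathbf{q}_i, \tag{13}$$

which only depends on $\gamma_i$ and $\mathbf{B}_i$.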
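The block-wise split of the cost function can be checked numerically. The following NumPy sketch builds a small random problem (all sizes, the seed, and the helper names `block` and `cost` are arbitrary assumptions), evaluates the cost (6) directly, and compares it with the sum of the part that excludes block $i$ and the part that depends only on block $i$, as in (11)-(13).

```python
import numpy as np

rng = np.random.default_rng(1)
M, d, g = 12, 3, 4                 # toy sizes, chosen only for illustration
beta = 5.0
Phi = rng.standard_normal((M, d * g))
y = rng.standard_normal(M)
gammas = rng.uniform(0.5, 2.0, size=g)
Bs = []
for _ in range(g):
    R = rng.standard_normal((d, d))
    Bs.append(R @ R.T + d * np.eye(d))      # random positive definite B_i

def block(i):
    """Columns of Phi that belong to the i-th block."""
    return Phi[:, i * d:(i + 1) * d]

def cost(C):
    """log|C| + y^T C^{-1} y, as in Eq. (6)."""
    return np.linalg.slogdet(C)[1] + y @ np.linalg.solve(C, y)

i = 2                                        # block singled out in (7)-(8)
C_minus = np.eye(M) / beta + sum(
    gammas[m] * block(m) @ Bs[m] @ block(m).T for m in range(g) if m != i
)
A = gammas[i] * Bs[i]
C = C_minus + block(i) @ A @ block(i).T

s = block(i).T @ np.linalg.solve(C_minus, block(i))
q = block(i).T @ np.linalg.solve(C_minus, y)
# L(i) from Eq. (13)
L_i = np.linalg.slogdet(np.eye(d) + A @ s)[1] - q @ np.linalg.solve(np.linalg.inv(A) + s, q)

L_total = cost(C)                  # direct evaluation of Eq. (6)
L_split = cost(C_minus) + L_i      # L(-i) + L(i), Eq. (12)
```

Up to floating-point error, `L_total` and `L_split` agree, which is what allows each block to be updated while the contribution of the other blocks is held fixed.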

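Setting $\partial \mathcal{L}(i) / \partial \gamma_i = 0$, we have the updating rule

$$\gamma_i = \frac{1}{d_i} \mathrm{Tr}\left[\mathbf{B}_i^{-1} \mathbf{s}_i^{-1} (\mathbf{q}_i \mathbf{q}_i^T - \mathbf{s}_i) \mathbf{s}_i^{-1}\right]. \tag{14}$$

Setting $\partial \mathcal{L}(i) / \partial \mathbf{B}_i = 0$, we have the updating rule

$$\mathbf{B}_i = \frac{1}{\gamma_i} \mathbf{s}_i^{-1} (\mathbf{q}_i \mathbf{q}_i^T - \mathbf{s}_i) \mathbf{s}_i^{-1}. \tag{15}$$

3.2. Regularization to $\mathbf{B}_i$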



As noted in [2], regularization to $\mathbf{B}_i$ is required due to limited data. It has been shown
[6] that in noiseless cases the regularization does not affect the global minimum of the cost
function (6), i.e., the global minimum still corresponds to the true solution; the regularization
only affects the probability that the algorithm converges to a local minimum. A good
regularization can largely reduce the probability of local convergence. Although theories on
regularization strategies are lacking, some empirical methods [2] have been presented.

One is to model the entries in each block as a first-order auto-regressive (AR) process
with the AR coefficient $r_i$. As a result, $\mathbf{B}_i$ has the following form

$$\mathbf{B}_i = \mathrm{Toeplitz}([1, r_i, \cdots, r_i^{d_i - 1}]), \tag{16}$$



where Toeplitz($\cdot$) is a Matlab command expanding a real vector into a symmetric Toeplitz
matrix. Thus the level of the intra-block correlation is reflected by the value of
$r_i$. $r_i$ can be estimated from the cost function (6) directly, or can be roughly
calculated empirically from the estimated $\mathbf{B}_i$ in (15), as shown in [2]. We used the latter
method, since it provides satisfactory results and saves considerable computation. Following
that approach, $r_i$ is calculated by $r_i = m_1^i / m_0^i$, where $m_0^i$ (resp. $m_1^i$) is the average
of entries along the main diagonal (resp. the main sub-diagonal) of the matrix $\mathbf{B}_i$. This
calculation cannot ensure that $r_i$ has a feasible value, i.e., $|r_i| < 0.99$. Thus in practice,
we calculate $r_i$ by

$$r_i = \mathrm{sign}\left(\frac{m_1^i}{m_0^i}\right) \min\left\{\left|\frac{m_1^i}{m_0^i}\right|, 0.99\right\}. \tag{17}$$

Our algorithm using this regularization (16)-(17) is denoted by BSBL-FM(1).
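One pass over a single block under this regularization could look like the NumPy sketch below. It is an illustration under stated assumptions rather than the released BSBL-FM code: the helper names are hypothetical, the block size is assumed to be at least 2, the previous value `gamma_old` is used for the scaling in (15), and the sketch does not cover how blocks are added to or removed from the model.

```python
import numpy as np

def ar1_toeplitz(r, d):
    """Symmetric Toeplitz matrix with first row [1, r, ..., r^(d-1)], Eq. (16)."""
    idx = np.arange(d)
    return r ** np.abs(idx[:, None] - idx[None, :])

def estimate_r(B, r_max=0.99):
    """Eq. (17): sub-diagonal mean over diagonal mean, clipped to |r| <= r_max."""
    m0 = np.mean(np.diag(B))          # average of the main diagonal
    m1 = np.mean(np.diag(B, k=1))     # average of the main sub-diagonal
    ratio = m1 / m0
    return np.sign(ratio) * min(abs(ratio), r_max)

def update_block(s, q, gamma_old):
    """Apply (15), (17), (16) and then (14) to one block (hypothetical helper)."""
    d = s.shape[0]
    s_inv = np.linalg.inv(s)
    T = s_inv @ (np.outer(q, q) - s) @ s_inv      # term shared by (14) and (15)
    B_raw = T / gamma_old                          # unregularized B_i, Eq. (15)
    r = estimate_r(B_raw)                          # Eq. (17)
    B = ar1_toeplitz(r, d)                         # regularized B_i, Eq. (16)
    gamma = np.trace(np.linalg.solve(B, T)) / d    # Eq. (14)
    return gamma, B, r
```

Because $r_i$ is a ratio of two averages of the same matrix, the scaling by `gamma_old` does not change its value as long as `gamma_old` is positive.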

In many real-world applications, the intra-block correlation in each block of a signal tends
to be both positive and high. Thus, one can further constrain all blocks to have the same
AR coefficient $r$ [2],

$$r = \frac{1}{g} \sum_{i=1}^{g} r_i, \tag{18}$$

where $r$ is the average of all the $r_i$. Then, $\mathbf{B}_i$ is reconstructed as

$$\mathbf{B}_i = \mathrm{Toeplitz}([1, r, \cdots, r^{d_i - 1}]). \tag{19}$$

Our algorithm using this regularization (18)-(19) is denoted by BSBL-FM(2).
3.3. Remarks on $\beta$

The parameter $\beta^{-1}$ is the noise variance in our model. It can be estimated by a number
of methods. However, the resulting updating rule is generally not robust and requires some
regularization. Instead, it is commonly treated as a regularizer and assigned some specific
values.³ Accordingly, we select $\beta^{-1} = 10^{-6}$ in noiseless simulations, $\beta^{-1} = 0.1\|\mathbf{y}\|_2^2$ in
general noisy scenarios (e.g., SNR < 20 dB), and $\beta^{-1} = 0.01\|\mathbf{y}\|_2^2$ in high SNR scenarios (e.g.,
SNR > 20 dB).
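The selection rule above can be written as a small helper. This is a sketch under two assumptions that are not stated in the text: the listed values are read as the noise variance $\beta^{-1}$, and an SNR of exactly 20 dB is treated as a high SNR scenario. The function name `noise_variance` is hypothetical.

```python
def noise_variance(y, snr_db=None):
    """Heuristic value for the noise variance (assumed to be beta^{-1}).

    snr_db=None stands for a noiseless simulation. The handling of
    exactly 20 dB is an assumption; the text only gives examples.
    """
    if snr_db is None:
        return 1e-6
    energy = sum(v * v for v in y)      # ||y||_2^2
    if snr_db < 20:
        return 0.1 * energy             # general noisy scenario
    return 0.01 * energy                # high SNR scenario
```

For example, `noise_variance(y, snr_db=15)` scales with the energy of the measurements, so the same rule can be reused across experiments with different signal amplitudes.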

3 For example, one can see this by examining the published codes of the algorithms in 
4. Experiments

In the experiments,⁴ both BSBL-FM(1) and BSBL-FM(2) were used. Our algorithm
was also run in a mode that ignored intra-block correlation, i.e., fixing $\mathbf{B}_i = \mathbf{I}$ ($\forall i$)
(denoted by BSBL-FM(0)). For comparison, two BSBL algorithms, i.e., BSBL-BO and
BSBL-$\ell_1$ [2], were used (BSBL-$\ell_1$ used the Group Basis Pursuit in its inner loop).
Besides, a variational inference based SBL algorithm which also exploits the intra-block
correlation (denoted by VBGS) was selected. It used its default parameters.
Model-CoSaMP [4] and Block-OMP [9] (given the true sparsity) were used as the benchmarks in
noiseless situations, while the Group Basis Pursuit was used as the benchmark in noisy
situations.

The performance indexes were the normalized mean square error (NMSE) in noisy
situations and the success rate in noiseless situations. The NMSE was defined as
$\|\hat{\mathbf{x}} - \mathbf{x}_{gen}\|_2^2 / \|\mathbf{x}_{gen}\|_2^2$, where $\hat{\mathbf{x}}$ was the estimate of the true signal $\mathbf{x}_{gen}$. The success rate was
defined as the percentage of successful trials in the total experiments (a successful trial was
defined as one in which NMSE $< 10^{-5}$).
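Both performance indexes are straightforward to compute. The sketch below uses hypothetical function names and returns the success rate as a fraction between 0 and 1 rather than a percentage.

```python
import numpy as np

def nmse(x_hat, x_gen):
    """Normalized mean square error ||x_hat - x_gen||_2^2 / ||x_gen||_2^2."""
    x_hat = np.asarray(x_hat, dtype=float)
    x_gen = np.asarray(x_gen, dtype=float)
    return np.sum((x_hat - x_gen) ** 2) / np.sum(x_gen ** 2)

def success_rate(nmse_values, tol=1e-5):
    """Fraction of trials whose NMSE is below tol."""
    return float(np.mean(np.asarray(nmse_values) < tol))
```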

In all the experiments except for the last one, the sensing matrix was a random Gaussian
matrix, generated anew in each trial of each experiment. The computer used in the
experiments had a 2.5 GHz CPU and 2 GB of RAM.

4.1. Empirical Phase Transition

In the first experiment, we studied the phase transitions of all the algorithms in exact
recovery of block sparse signals in noiseless situations. The phase transition curve [12]
shows how the success rate is affected by the sparsity level (defined as $\rho = K/M$, where $K$
is the total number of non-zero elements) and indeterminacy (defined as $\delta = M/N$).

4 The source codes are available: http://nudtpaper.googlecode.com/files/bsbljfm.zip

Figure 1: Empirical 96% phase transitions of all algorithms. Each point on the plotted phase transition
curve corresponds to a success rate larger than or equal to 0.96. Above the curve the success rate drops
sharply.

The generated signal consisted of 20 blocks with an identical block size of 25. The number
of non-zero blocks varied from 1 to 10, while their locations were determined randomly.
Each non-zero block was generated by a multivariate Gaussian distribution $\mathcal{N}(\mathbf{0}, \mathbf{\Sigma}_{gen})$ with
$\mathbf{\Sigma}_{gen} = \mathrm{Toeplitz}([1, r, \cdots, r^{24}])$. The parameter $r$, which reflected the intra-block correlation
level, was set to 0.95. The number of measurements varied from $M = 50$ to $M = 250$.
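A signal of this kind can be generated as in the following NumPy sketch. The function name, the default number of active blocks, and the use of a Cholesky factor to draw the correlated samples are illustrative assumptions, not a description of the authors' scripts.

```python
import numpy as np

def generate_block_sparse(g=20, d=25, k=5, r=0.95, rng=None):
    """Block sparse signal with k randomly located AR(1)-correlated blocks."""
    rng = np.random.default_rng() if rng is None else rng
    idx = np.arange(d)
    # Sigma_gen = Toeplitz([1, r, ..., r^(d-1)])
    Sigma_gen = r ** np.abs(idx[:, None] - idx[None, :])
    L = np.linalg.cholesky(Sigma_gen)          # Sigma_gen = L @ L.T
    x = np.zeros(g * d)
    active = rng.choice(g, size=k, replace=False)
    for b in active:
        # L @ z with z ~ N(0, I) has covariance Sigma_gen
        x[b * d:(b + 1) * d] = L @ rng.standard_normal(d)
    return x, np.sort(active)

# One trial with M = 100 measurements and a fresh Gaussian sensing matrix
rng = np.random.default_rng(0)
x_gen, active_blocks = generate_block_sparse(rng=rng)
Phi = rng.standard_normal((100, x_gen.size))
y = Phi @ x_gen                                # noiseless observation
```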

The results averaged over 100 trials are shown in Fig. 1. Both BSBL-FM and BSBL-BO
showed impressive phase transition performance. We see that, as a greedy method, BSBL-FM
performed better than VBGS, Model-CoSaMP and Block-OMP.

4.2. Performance in Noisy Environments

In each trial we generated a signal of length $N = 512$. It consisted of 32 blocks with
block size 16. Among them, 5 blocks were non-zero. The intra-block correlation level,
i.e., $r$, of each block (generated as in Section 4.1) was chosen uniformly from 0.8 to 0.99. The
number of measurements was 128. The SNR, defined as SNR(dB) $= 20\log_{10}(\|\mathbf{\Phi}\mathbf{x}_{gen}\|_2 / \|\mathbf{n}\|_2)$,
varied from 10 dB to 25 dB. In this experiment we also calculated the oracle result, which
was the least squares estimate of $\mathbf{x}_{gen}$ given the true support.
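The noise scaling implied by this SNR definition and the oracle estimate can be sketched as follows. The function names are hypothetical, and the noise is drawn as white Gaussian and then rescaled to hit the requested SNR exactly, which is one common convention and an assumption here.

```python
import numpy as np

def add_noise(clean, snr_db, rng):
    """Add white Gaussian noise n so that 20*log10(||clean||_2 / ||n||_2) = snr_db."""
    n = rng.standard_normal(clean.shape)
    n *= np.linalg.norm(clean) / (np.linalg.norm(n) * 10 ** (snr_db / 20))
    return clean + n, n

def oracle_estimate(Phi, y, support):
    """Least squares estimate restricted to the known support."""
    x = np.zeros(Phi.shape[1])
    x[support] = np.linalg.lstsq(Phi[:, support], y, rcond=None)[0]
    return x
```

Here `support` is an array of the indexes of the non-zero entries of the true signal, which is available only in simulation.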

From the results (Fig. 2), we see that the proposed algorithm, when exploiting intra-block
correlation, outperformed Group Basis Pursuit and VBGS, and had recovery performance close to
that of BSBL-BO and BSBL-$\ell_1$. BSBL-FM(1), BSBL-FM(2) and BSBL-BO even outperformed
the oracle estimate. This may be because the oracle estimate utilized only the true
support information while ignoring the structure in signals (i.e., the intra-block correlation).
It is worth noting that our algorithm had the fastest speed.

Figure 2: The comparison in NMSE and CPU time of all the algorithms under different SNR levels.

Figure 3: The comparison in NMSE and CPU time with varying N.

4.3. Performance in Larger N

This experiment was designed to show the advantage of our algorithm in speed. The
signal consisted of 32 blocks with identical block size, five of which were randomly located
non-zero blocks. The length of the signal, $N$, was varied from 512 to 2048. The SNR was 15 dB.

The results (Fig. 3) show that the proposed algorithm, although its recovery performance
was slightly poorer than that of BSBL-BO and BSBL-$\ell_1$, had an obvious advantage in speed.
This implies that the proposed algorithm may be a better choice for large-scale problems.
Also, by comparing BSBL-FM(1) and BSBL-FM(2) to BSBL-FM(0), we can see that the recovery
performance was improved by exploiting intra-block correlation.



4.4. Application to Telemonitoring of Fetal Electrocardiogram

Fetal electrocardiogram (FECG) telemonitoring via low-energy wireless body-area networks
is an important approach to monitoring fetal health. BSBL, as an important
branch of compressed sensing, has shown great promise in this application [2]. Using BSBL,
one can compress raw FECG recordings using a sparse binary matrix, i.e.,

$$\mathbf{y} = \mathbf{\Phi}\mathbf{x}, \tag{20}$$

where $\mathbf{x}$ is a raw FECG recording, $\mathbf{\Phi}$ is the sparse binary matrix, and $\mathbf{y}$ is the compressed
data. It has been shown [3] that using a sparse binary matrix as the sensing matrix
can greatly reduce energy consumption while achieving a competitive compression ratio.
Then $\mathbf{y}$ is sent to a remote computer. On this computer, BSBL algorithms can recover the
raw FECG recordings with high accuracy, such that independent component analysis (ICA)
decomposition [14] on the recovered recordings keeps high fidelity (and a clean FECG is
presented after the ICA decomposition).

Here we repeated the experiment in Section III.B in [7],⁵ using the same dataset,
the same sensing matrix (a sparse binary matrix of size $256 \times 512$ with each column
consisting of 12 entries of 1s at random locations), and the same block partition
($d_i = 24$ ($\forall i$)).
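A sensing matrix with this structure can be drawn as in the sketch below. The function name and the random generator are illustrative assumptions; the experiment itself reused the matrix from the cited work rather than generating a new one.

```python
import numpy as np

def sparse_binary_matrix(m=256, n=512, ones_per_col=12, rng=None):
    """m x n matrix of 0s and 1s with exactly ones_per_col 1s in each column."""
    rng = np.random.default_rng() if rng is None else rng
    Phi = np.zeros((m, n))
    for j in range(n):
        # distinct random row positions for the 1s of column j
        rows = rng.choice(m, size=ones_per_col, replace=False)
        Phi[rows, j] = 1.0
    return Phi

Phi = sparse_binary_matrix(rng=np.random.default_rng(0))
x = np.random.default_rng(1).standard_normal(512)   # stand-in for one raw segment
y = Phi @ x                                          # compressed data, Eq. (20)
```

Since every entry of the matrix is 0 or 1, each compressed sample is a plain sum of selected raw samples, so the compression step needs no multiplications.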

5 The experiment demo is available: https://sites.google.com/site/researchbyzhang/bsbl

Table 1: NMSE and the total CPU time in recovery of the whole FECG recordings.

              Average NMSE   Total CPU Time (s)
BSBL-BO       0.0028         712.9
BSBL-FM(2)    0.0035         175.9
BSBL-FM(1)    0.0037         178.7
VBGS          0.1126         9670.8

Since it has been shown [7] that non-SBL algorithms failed in this application due to the
non-sparsity of raw FECG recordings and the limited sparsity of their representation coefficients
in most transform domains, we only compared our algorithm BSBL-FM with VBGS and
BSBL-BO. BSBL-BO recovered raw FECG recordings directly, as in [7]. In contrast,
BSBL-FM and VBGS first recovered the discrete cosine transform (DCT) coefficients $\boldsymbol{\theta}$ of
the recordings according to

$$\mathbf{y} = (\mathbf{\Phi}\mathbf{D})\boldsymbol{\theta} \tag{21}$$

using $\mathbf{y}$ and $\mathbf{\Phi}\mathbf{D}$, where $\mathbf{D}$ was the basis of the DCT transform such that $\mathbf{x} = \mathbf{D}\boldsymbol{\theta}$. Then
they reconstructed the original raw FECG recordings according to $\mathbf{x} = \mathbf{D}\boldsymbol{\theta}$ using $\mathbf{D}$ and
the recovered $\boldsymbol{\theta}$.
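The relation between the raw signal, its DCT coefficients and the compressed data can be illustrated with an explicitly constructed orthonormal DCT-II basis. The sketch below is an assumption-laden toy: the helper `dct_basis`, the cosine test signal and the Gaussian stand-in for the sensing matrix are not from the experiment, and the exact DCT variant used by the authors is not stated in the text.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal matrix D with x = D @ theta, theta being DCT-II coefficients."""
    k = np.arange(n)[:, None]            # frequency index
    t = np.arange(n)[None, :]            # time index
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)              # scale the DC row for orthonormality
    return C.T                           # theta = C @ x, hence x = C.T @ theta

n = 512
D = dct_basis(n)
x = np.cos(2 * np.pi * 5 * np.arange(n) / n)    # toy signal, not FECG data
theta = D.T @ x                                  # DCT coefficients of x

Phi = np.random.default_rng(0).standard_normal((256, n))   # stand-in sensing matrix
y = Phi @ x
y_from_theta = (Phi @ D) @ theta                 # same data expressed as in Eq. (21)
```

A recovery algorithm applied to `y` with the effective sensing matrix `Phi @ D` returns an estimate of `theta`, and multiplying that estimate by `D` gives the reconstructed signal.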

The NMSE measured on the recovered FECG recordings is shown in Table 1. We can
see that, although BSBL-FM had slightly poorer recovery accuracy than BSBL-BO, it was much
faster. In fact, the ICA decomposition of the recordings recovered by BSBL-FM also
presented a clean FECG (see Fig. 4), and the decomposition was almost the same as the
ICA decomposition of the original recordings. In this experiment we noticed that VBGS
took a long time to recover the FECG recordings and had the largest NMSE. Besides, the
ICA decomposition of its recovered recordings did not present a clean FECG. This
may be because the DCT coefficients of the raw FECG recordings are not
sufficiently sparse, and recovering these less sparse coefficients is very difficult for non-BSBL
algorithms [7].



5. Conclusion 

The block sparse Bayesian learning (BSBL) algorithms outperform
other state-of-the-art algorithms in the recovery of block sparse signals, especially when the
signals have intra-block correlation. However, existing BSBL algorithms are not very fast,
so they may not be suitable for large-scale problems. To fill this gap, this work proposed
a fast BSBL algorithm, which also exploits the intra-block correlation. Experiments showed
that it significantly outperforms non-BSBL algorithms and has recovery performance close to
that of existing BSBL algorithms, while being the fastest among these BSBL algorithms.
Figure 4: ICA decomposition of the original dataset (a), the dataset recovered by BSBL-FM(2) (b),
BSBL-FM(1) (c) and VBGS (d) (only the first 1000 sampling points of the datasets are shown). The fourth ICs
in (a), (b) and (c) are the extracted FECGs from the original dataset and the reconstructed datasets. The
ICA decomposition of the recordings recovered by VBGS (d) did not extract the FECG.